safety test
- Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (0.95)
- Government > Regional Government > North America Government > United States Government > FDA (0.46)
- North America > Canada > Alberta (0.14)
- North America > United States > Massachusetts (0.04)
- North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.04)
- (2 more...)
- Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (1.00)
- Government (1.00)
- Information Technology (0.93)
- North America > United States (0.28)
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
- Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (0.96)
- Government (0.68)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.96)
- Information Technology > Data Science (0.68)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
- North America > Canada > Alberta (0.14)
- North America > United States > Massachusetts (0.04)
- Europe (0.04)
- Information Technology > Security & Privacy (1.00)
- Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (1.00)
- Government > Military (0.69)
- Government > Regional Government (0.68)
- Information Technology > Security & Privacy (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.70)
- Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.46)
ChatGPT offered bomb recipes and hacking tips during safety tests
A ChatGPT model gave researchers detailed instructions on how to bomb a sports venue – including weak points at specific arenas, explosives recipes and advice on covering tracks – according to safety testing carried out this summer. OpenAI's GPT-4.1 also detailed how to weaponise anthrax and how to make two types of illegal drugs. The testing was part of an unusual collaboration between OpenAI, the $500bn artificial intelligence start-up led by Sam Altman, and rival company Anthropic, founded by experts who left OpenAI over safety fears. Each company tested the other's models by pushing them to help with dangerous tasks. The testing is not a direct reflection of how the models behave in public use, when additional safety filters apply.
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.73)
Review for NeurIPS paper: Security Analysis of Safe and Seldonian Reinforcement Learning Algorithms
Weaknesses: W1: The study seems to focus too heavily on algorithms that are based on safety tests. I understand that the analysis is not compatible, but it might be worth including a study of how easy it is to trick those algorithms too. More generally (even for IS algorithms), it seemed odd to me that the study does not consider attacks on the way pi_e is chosen. W2: It is unclear to me whether the trajectory must still have been performed in the real environment, or whether it can be completely made up (but then its value has to be within the range [0,1]). Also, with model-based methods (for both environment and policy models), it might be possible to single out the few trajectories that are inconsistent with the other trajectories.
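For readers unfamiliar with the safety tests the review refers to: Seldonian-style algorithms typically accept a candidate policy pi_e only if a high-confidence lower bound on its importance-sampled return beats a baseline. The sketch below is illustrative (not the paper's exact test); it assumes the weighted returns have already been clipped into [0, b_max] and uses a Hoeffding bound, where real implementations often use tighter Student's t bounds. Note the fabrication attack the review raises: nothing here checks that the trajectories were actually collected in the real environment.

```python
import math

def safety_test(weighted_returns, baseline, delta=0.05, b_max=1.0):
    """One-sided Hoeffding test: pass only if, with probability >= 1 - delta,
    the true importance-weighted return of pi_e exceeds the baseline.

    Assumes each weighted return lies in [0, b_max]. The inputs are taken
    on trust -- a fabricated trajectory with a plausible in-range value
    (the attack raised in W2) would not be detected here.
    """
    n = len(weighted_returns)
    mean = sum(weighted_returns) / n
    # Hoeffding: true mean >= empirical mean - eps with prob >= 1 - delta
    eps = b_max * math.sqrt(math.log(1.0 / delta) / (2.0 * n))
    return mean - eps > baseline
```

With 100 returns averaging 0.6 against a baseline of 0.4, the confidence radius is about 0.12, so the test passes; with only 10 samples the radius widens to about 0.39 and the same data fails against a baseline of 0.55.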
CSPI-MT: Calibrated Safe Policy Improvement with Multiple Testing for Threshold Policies
Cho, Brian M, Pop, Ana-Roxana, Gan, Kyra, Corbett-Davies, Sam, Nir, Israel, Evnine, Ariel, Kallus, Nathan
When modifying existing policies in high-risk settings, it is often necessary to ensure with high certainty that the newly proposed policy improves upon a baseline, such as the status quo. In this work, we consider the problem of safe policy improvement, where one only adopts a new policy if it is deemed to be better than the specified baseline with at least pre-specified probability. We focus on threshold policies, a ubiquitous class of policies with applications in economics, healthcare, and digital advertising. Existing methods rely on potentially underpowered safety checks and limit the opportunities for finding safe improvements, so too often they must revert to the baseline to maintain safety. We overcome these issues by leveraging the most powerful safety test in the asymptotic regime and allowing for multiple candidates to be tested for improvement over the baseline. We show that in adversarial settings, our approach controls the rate of adopting a policy worse than the baseline to the pre-specified error level, even in moderate sample sizes. We present CSPI and CSPI-MT, two novel heuristics for selecting cutoff(s) to maximize the policy improvement from baseline. We demonstrate through both synthetic and external datasets that our approaches improve both the detection rates of safe policies and the realized improvement, particularly under stringent safety requirements and low signal-to-noise conditions.
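The core mechanism the abstract describes — test several candidates, but only adopt one whose safety check clears the baseline at a corrected error level — can be sketched minimally. This is not the paper's CSPI-MT procedure: it substitutes a plain asymptotic (normal-approximation) lower confidence bound with a Bonferroni correction across the candidates, and the per-sample reward lists are hypothetical inputs standing in for the paper's policy-value estimates.

```python
from statistics import NormalDist, mean, stdev

def adopt_policy(candidate_rewards, baseline_value, alpha=0.05):
    """Among candidate policies whose Bonferroni-corrected asymptotic
    lower confidence bound on mean reward exceeds the baseline, adopt
    the one with the highest point estimate; otherwise keep the baseline.

    candidate_rewards: list of per-sample reward lists, one per candidate.
    Returns the index of the adopted candidate, or None for the baseline.
    """
    m = len(candidate_rewards)
    # Bonferroni: split the error budget alpha across the m tests
    z = NormalDist().inv_cdf(1 - alpha / m)
    best_idx, best_mean = None, baseline_value
    for i, rewards in enumerate(candidate_rewards):
        mu = mean(rewards)
        se = stdev(rewards) / len(rewards) ** 0.5
        if mu - z * se > baseline_value and mu > best_mean:
            best_idx, best_mean = i, mu
    return best_idx
```

Because every adopted candidate must individually clear its corrected safety test, the chance of adopting any policy worse than the baseline stays below alpha — the error-control property the abstract claims — while testing multiple cutoffs raises the chance that at least one safe improvement is found.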
- North America > United States > California > San Francisco County > San Francisco (0.14)
- Asia > Middle East > Israel (0.05)
- North America > United States > New York > New York County > New York City (0.04)
- (3 more...)
Warning that robot lawnmowers are killing hedgehogs: Scientists propose must-have garden gadgets come with 'safety certificates'
Hedgehogs are increasingly being killed and injured in encounters with robot lawnmowers, which have few safety features to protect wildlife, according to Oxford University scientists. Researchers conducted a series of tests with the mowers, the latest must-have garden gadget, with a view to creating a 'hedgehog friendly' certification so gardeners need not fear any prickly casualties when they trim the grass. To ensure no harm was caused to living hedgehogs, scientists used rubber 'crash test hedgehogs' instead to see if the robot mower would turn away on encountering one of Mrs Tiggywinkle's tribe on the lawn. Hedgehogs are already in serious decline, with reasons including habitat loss, road traffic accidents, intensive agriculture, and injuries from dog bites and garden strimmers. But now mowers are adding to the threats.